Case study on Moneyball
Bill James originated sabermetrics, the approach of using data to determine which outcomes best predict whether a team will win.
The goal of a baseball game is to score more runs than the other team.
Each team has 9 batters who, in a predetermined order, each get an opportunity to hit a ball with a bat.
Each time a batter has an opportunity to bat, we call it a plate appearance (PA).
The PA ends with a binary outcome: the batter either makes an out (failure) and returns to the bench or the batter doesn’t (success) and can run around the bases, and potentially score a run (reach all 4 bases).
There are five ways a batter can succeed (not make an out):
Bases on balls (BB): the pitcher fails to throw the ball through a predefined area considered to be hittable (the strike zone), so the batter is permitted to go to first base.
Single: the batter hits the ball and gets to first base.
Double (2B): the batter hits the ball and gets to second base.
Triple (3B): the batter hits the ball and gets to third base.
Home Run (HR): the batter hits the ball and goes all the way home and scores a run.
Historically, the batting average has been considered the most important offensive statistic. To define this average, we define a hit (H) and an at bat (AB). Singles, doubles, triples and home runs are hits. The fifth way to be successful, a walk (BB), is not a hit. An AB is the number of times you either get a hit or make an out; BBs are excluded. The batting average is simply H/AB and is considered the main measure of a success rate.
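As a worked example of the definition above, the following sketch computes H, AB and the batting average from outcome counts. The season totals are hypothetical numbers chosen for easy arithmetic, not real data, and the "success rate" that also counts walks is a simplified illustration based only on the five outcomes listed here.

```r
# Hypothetical season totals for one batter (illustrative, not real data)
singles   <- 100
doubles   <- 30
triples   <- 5
home_runs <- 15
walks     <- 60    # bases on balls (BB): a success, but not a hit
outs      <- 350

hits    <- singles + doubles + triples + home_runs  # H
at_bats <- hits + outs                              # AB: hits plus outs, BBs excluded

# Plate appearances under the simplified model where a PA ends in a hit, a walk or an out
plate_appearances <- at_bats + walks

batting_average <- hits / at_bats                        # H / AB
success_rate    <- (hits + walks) / plate_appearances    # counts walks as successes

batting_average  # 0.3
success_rate     # 0.375
```

The two numbers differ because the batting average ignores the 60 walks, even though each one is a plate appearance that did not end in an out.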
The visualization of choice when exploring the relationship between two variables like home runs and runs is a scatterplot.
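A minimal sketch of such a scatterplot, assuming the Lahman, dplyr and ggplot2 packages are installed. It uses the same 1961 to 2001 range as the exercises in these notes. The name hr_vs_runs and the per-game column names are chosen here for illustration.

```r
library(Lahman)   # provides the Teams data frame
library(dplyr)
library(ggplot2)

# Convert season totals to per-game rates so that seasons of different length are comparable
hr_vs_runs <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(HR_per_game = HR / G, R_per_game = R / G)

# One point per team-season: home runs per game (x) against runs per game (y)
ggplot(hr_vs_runs, aes(x = HR_per_game, y = R_per_game)) +
  geom_point(alpha = 0.5)
```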

What is the application of statistics and data science to baseball called?
Sabermetrics
Which outcome is not included in the batting average?
A base on balls
Why do we consider team statistics as well as individual player statistics?
Team statistics are important because the success of individual players also depends on the strength of their team.
You want to know whether teams with more at-bats per game have more runs per game.

Load the Lahman library. Filter the Teams data frame to include years from 1961 to 2001. Make a scatterplot of runs per game versus at bats (AB) per game.
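One way to approach this exercise is sketched below, assuming the Lahman, dplyr and ggplot2 packages are installed. "Runs per game versus at bats per game" is read here as runs on the y-axis and at bats on the x-axis. The object name teams_1961_2001 is illustrative.

```r
library(Lahman)   # provides the Teams data frame
library(dplyr)
library(ggplot2)

# Keep the 1961-2001 seasons and compute per-game rates
teams_1961_2001 <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(AB_per_game = AB / G, R_per_game = R / G)

# At bats per game on the x-axis, runs per game on the y-axis
ggplot(teams_1961_2001, aes(x = AB_per_game, y = R_per_game)) +
  geom_point(alpha = 0.5) +
  labs(x = "At bats per game", y = "Runs per game")
```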
Use the filtered Teams data frame from Question 6. Make a scatterplot of win rate (number of wins per game) versus number of fielding errors (E) per game.
Which of the following is true?
When you examine the scatterplot above, you can see a clear trend towards decreased win rate with increasing number of errors per game.
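The plot described above can be sketched as follows, assuming the Lahman, dplyr and ggplot2 packages are installed. The fitted line from geom_smooth is an optional addition that makes the direction of the trend easier to see. The object and column names are illustrative.

```r
library(Lahman)   # provides the Teams data frame
library(dplyr)
library(ggplot2)

# Win rate is wins divided by games played; errors are also put on a per-game basis
errors_vs_wins <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(win_rate = W / G, E_per_game = E / G)

# Errors per game on the x-axis, win rate on the y-axis, with a least-squares line
ggplot(errors_vs_wins, aes(x = E_per_game, y = win_rate)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(x = "Fielding errors per game", y = "Wins per game")
```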
Use the filtered Teams data frame from Question 6. Make a scatterplot of triples (X3B) per game versus doubles (X2B) per game.
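A possible solution sketch, assuming the Lahman, dplyr and ggplot2 packages are installed. Triples go on the y-axis and doubles on the x-axis, following the "triples versus doubles" wording. The object and column names are illustrative.

```r
library(Lahman)   # provides the Teams data frame
library(dplyr)
library(ggplot2)

# X2B and X3B are the doubles and triples columns in Teams
doubles_vs_triples <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(doubles_per_game = X2B / G, triples_per_game = X3B / G)

# Doubles per game on the x-axis, triples per game on the y-axis
ggplot(doubles_vs_triples, aes(x = doubles_per_game, y = triples_per_game)) +
  geom_point(alpha = 0.5) +
  labs(x = "Doubles per game", y = "Triples per game")
```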